790 research outputs found

Towards Data-Driven Autonomics in Data Centers

Author: Babaoglu Ozalp
Sîrbu Alina
Publication date: 01/01/2015

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using generated data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating a predictive model for node failures. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing machine state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict if machines will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, we can achieve true positive rates between 27% and 88% with precision varying between 50% and 72%. We discuss the practicality of including our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available from the authors' website. Comment: 12 pages, 6 figures
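The evaluation criterion in this abstract (cap the false positive rate at 5%, then report true positive rate and precision) can be illustrated with a minimal sketch. This is not the authors' code: the function names `threshold_for_fpr` and `rates`, and the toy scores, are hypothetical, and the scores are assumed to come from any classifier that outputs a failure score per node.

```python
def threshold_for_fpr(scores, labels, max_fpr=0.05):
    """Pick a score threshold so that at most max_fpr of the negatives score above it."""
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    allowed = int(max_fpr * len(negatives))  # how many negatives may be flagged
    if allowed >= len(negatives):
        return float("-inf")  # every negative may be flagged
    # Flag only scores strictly above the (allowed + 1)-th highest negative score.
    return negatives[allowed]


def rates(scores, labels, threshold):
    """Return (true positive rate, false positive rate, precision) at a threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fpr, precision


# Hypothetical scores: 20 healthy nodes (label 0) and 4 failing nodes (label 1).
scores = [i / 100 for i in range(1, 21)] + [0.90, 0.80, 0.15, 0.05]
labels = [0] * 20 + [1] * 4
t = threshold_for_fpr(scores, labels)
print(rates(scores, labels, t))
```

Fixing the false positive rate first and reading off the other two metrics mirrors how the abstract reports its results; a different operating point would trade precision against recall.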

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Towards Operator-less Data Centers Through Data-Driven, Predictive, Proactive Autonomics

Author: Babaoglu Ozalp
Sîrbu Alina
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict if nodes will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, we can achieve true positive rates between 27% and 88% with precision varying between 50% and 72%. This level of performance allows us to recover a large fraction of job executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. [...

arXiv.org e-Print Archive

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Prediction of Fatalities in Vehicle Collisions in Canada

Author: Babaoglu Ceni
Babaoglu Liza
Publication venue: 'Faculty of Transport and Traffic Sciences'
Publication date: 01/01/2021

Traffic collisions affect millions around the world and are the leading cause of death for children and young adults. Thus, Canada’s road safety plan is to reduce collision injuries and fatalities with a vision of having the safest roads in the world. We aim to predict fatalities in collisions on Canadian roads, and to discover the causes of fatalities through exploratory data analysis and machine learning techniques. We analyse the vehicle collisions from Canada’s National Collision Database (1999–2017). Through data mining methodologies, we investigate association rules and key contributing factors that lead to fatalities. Then, we propose two supervised learning classification models, Lasso Regression and XGBoost, to predict fatalities. Our analysis shows the deadliness of head-on collisions, especially in non-intersection areas lacking traffic control systems. We also reveal that most collision fatalities occur in non-extreme weather and road conditions. Our prediction models show that the best classifier of fatalities is XGBoost with 83% accuracy. Its most important features are the “collision configuration” and “used safety devices” elements, which outrank attributes such as vehicle year, collision time, age, or sex of the individual. Our exploratory and predictive analyses reveal the importance of road design and traffic safety education.
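The association rules mentioned in this abstract are usually scored by support, confidence, and lift. The sketch below shows those three measures for a single rule; it is an illustration only, not the authors' pipeline. The function name `rule_metrics` and the toy records are hypothetical and are not drawn from the National Collision Database.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    records is a list of sets of attribute strings; antecedent and consequent are sets.
    """
    n = len(records)
    has_a = sum(1 for r in records if antecedent <= r)
    has_c = sum(1 for r in records if consequent <= r)
    has_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = has_both / n if n else 0.0
    confidence = has_both / has_a if has_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (has_c / n) if has_c else 0.0
    return support, confidence, lift


# Hypothetical collision records, for illustration only.
records = [
    {"head_on", "fatal"}, {"head_on", "fatal"}, {"head_on"},
    {"rear_end"}, {"rear_end", "fatal"}, {"rear_end"},
    {"side"}, {"side"},
]
print(rule_metrics(records, {"head_on"}, {"fatal"}))
```

A lift above 1 indicates that the antecedent co-occurs with the consequent more often than the consequent's base rate alone would suggest; it does not by itself establish causation.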

Directory of Open Access Journals

HRČAK - Portal of Croatian Scientific and Professional Journals

A Big Data Analyzer for Large Trace Logs

Author: Babaoglu Ozalp
Balliu Alkida
Marzolla Moreno
Olivetti Dennis
Sîrbu Alina
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/09/2015

The current generation of Internet-based services is typically hosted in large data centers that take the form of warehouse-size structures housing tens of thousands of servers. Continued availability of a modern data center is the result of a complex orchestration among many internal and external actors including computing hardware, multiple layers of intricate software, networking and storage devices, electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents BiDAl, a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly available traces from Google data clusters, with the goal of building a realistic model of a complex data center. Comment: 26 pages, 10 figures

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Self-* Properties through Gossiping

Author: Babaoglu Ozalp
Jelasity Márk
Publication venue: 'The Royal Society'
Publication date: 01/01/2008

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

T-MAN: gossip-based overlay topology management

Author: Jelasity Márk
Babaoglu Ozalp
Publication venue: Springer
Publication date: 01/01/1982

The purpose of this special project is to present the genre of all-ages books and to offer reading suggestions to the interested reader. Briefly defined, all-ages books are books that children, young people, and adults can read with equal enjoyment. The project begins with excerpts from various interviews that I conducted with professionals in the book world. These are followed by some forty annotations that I wrote after reading these all-ages books. The books were selected on the recommendations of the aforementioned people. Finally, there is a list of non-annotated all-ages literature selected according to the same principles as the other works.

Crossref

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

University of Borås

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Global existence and blow-up of solutions for a general class of doubly dispersive nonlocal nonlinear wave equations

Author: Babaoğlu Ceni
Erbay Hüsnü A.
Erkip Albert
Publication date: 05/08/2012

This study deals with the analysis of the Cauchy problem of a general class of nonlocal nonlinear equations modeling the bi-directional propagation of dispersive waves in various contexts. The nonlocal nature of the problem is reflected by two different elliptic pseudodifferential operators acting on linear and nonlinear functions of the dependent variable, respectively. The well-known doubly dispersive nonlinear wave equation that incorporates two types of dispersive effects originating from two different dispersion operators falls into the category studied here. The class of nonlocal nonlinear wave equations also covers a variety of well-known wave equations such as various forms of the Boussinesq equation. Local existence of solutions of the Cauchy problem with initial data in suitable Sobolev spaces is proven and the conditions for global existence and finite-time blow-up of solutions are established. Comment: 17 pages

arXiv.org e-Print Archive

eResearch@Ozyegin

Sabanci University Research Database

Predicting system-level power for a hybrid supercomputer

Author: Babaoglu Ozalp
Sîrbu Alina
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

For current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list in July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information (the number and location of resources that are used by each job at a certain time) and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior.

arXiv.org e-Print Archive

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna